Integrate colqwen2.5 using colqwen2 modelling code #40600
Conversation
Nice job! 👍️ Please see my remarks below.
CI/CD tests are breaking because double quotes are used inside f-strings that are themselves delimited with double quotes. But that error will disappear anyway, because we should avoid using f-strings inside loggers in general.
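A minimal sketch of the point above (the checkpoint name is hypothetical, purely for illustration): with lazy `%`-style formatting the message string is only built if the record is actually emitted, and the nested-quote problem goes away entirely.

```python
import logging

logger = logging.getLogger(__name__)
checkpoint = "vidore/colqwen2.5-v0.2"  # hypothetical name, for illustration only

# Avoid: logger.info(f'Loading "{checkpoint}"') -- the f-string is evaluated
# even when the INFO level is disabled, and double quotes nested inside a
# double-quoted f-string break on older Python parsers.
# Prefer lazy %-style formatting instead:
logger.info("Loading checkpoint %s", checkpoint)
```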
src/transformers/models/colqwen2/convert_colqwen2_weights_to_hf.py
Seems like document retrieval / VLM, so cc @zucchini-nlp and maybe @tomaarsen?
yonigozlan
left a comment
Hello @sahil-kabir, thanks for contributing! Very nice if we can support colqwen2.5 without needing to add a whole new model 🤗.
Could you add a mention of colqwen2.5 support in the documentation as well? In transformers/docs/source/en/model_doc/colqwen2.md.
It would be great to add an integration test for colqwen2.5 like we have for colqwen2 in transformers/tests/models/colqwen2/test_modeling_colqwen2.py.
```python
dtype, device = self._get_dtype_device()
pixel_values = pixel_values.to(dtype=dtype, device=device)
```
Not a big fan of this; it'd be great if we can avoid having `use_qwen2_5` in the config. Let's just use the dtype and device of `inputs_embeds`:
```diff
- dtype, device = self._get_dtype_device()
- pixel_values = pixel_values.to(dtype=dtype, device=device)
+ pixel_values = pixel_values.to(inputs_embeds.device, inputs_embeds.dtype)
```
BTW, in the qwen-2 vision tower we cast pixels to the correct dtype manually, so this is not needed. Also, the LM and the vision tower might be loaded with different dtypes and devices in specific cases :)
@zucchini-nlp Did I understand correctly that this line could be removed entirely and it would still work?
@sahil-kabir Maybe worth a quick try. 😉
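For illustration, a minimal sketch of the suggested one-line cast, with dummy tensors standing in for the real model activations (the shapes are illustrative, not the model's):

```python
import torch

# Stand-ins for activations inside the model's forward pass:
inputs_embeds = torch.randn(2, 4, 8, dtype=torch.float16)
pixel_values = torch.randn(2, 3, 16, 16)  # fresh tensors default to float32

# A single .to() call replaces the _get_dtype_device() helper: it casts
# pixel_values to the device and dtype of inputs_embeds in one step.
pixel_values = pixel_values.to(inputs_embeds.device, inputs_embeds.dtype)
```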
```python
def _get_dtype_device(self) -> tuple[str, str]:
    if self.config.use_qwen2_5:
        parameters = next(self.vlm.visual.parameters())
    else:
        parameters = next(self.parameters())
    dtype, device = parameters.dtype, parameters.device
    return dtype, device
```
There was a problem hiding this comment.
No need for that then; this helper can be removed entirely.
```python
config = ColQwen2Config(
    vlm_config=original_config,
    embedding_dim=128,  # hardcoded in the original model
    use_qwen2_5=use_qwen2_5,
)
```
Instantiate a qwen2_5 config instead.
Hey @yonigozlan, integration tests and documentation are updated. I have details on what I did to make this work in the PR description above. Going to undraft this now 👍
yonigozlan
left a comment
Nice!
Very last thing to do would be to add an integration test directly against the original implementation. Other than that, LGTM!
@tonywu71 Do you think we could add the converted checkpoint to the vidore org and here?
Now waiting on @ArthurZucker or @Cyrilvallez final approval to get this merged!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@yonigozlan Should we also add the code and clear instructions for merging adapter weights either to …
run-slow: colqwen2
This comment contains run-slow, running the specified jobs: models: ['models/colqwen2']
Cyrilvallez
left a comment
LGTM! Thanks a lot, no diff to the original model 🤗
@yonigozlan, feel free to merge once everyone is happy and the tests are green! 🤗
Yes absolutely! Users should be able to convert the original colqwen2.5 weights to the Transformers weights using …
@antonioloison and @QuentinJGMace have admin rights on the repository and carried out the initial work on ColPali; can you check with them please? 🙏🏼
@yonigozlan confirmed the endpoint matches the original implementation and tests are added. I don't have access to a GPU with CUDA support (I have cpu and mps) so I set …
Hey @sahil-kabir ! Sorry for the delay, I just updated the integration tests with the values I get when running on our CI hardware.
@bot /style
Style bot fixed some files and pushed the changes.
[For maintainers] Suggested jobs to run (before merge): run-slow: colqwen2
yonigozlan
left a comment
Thanks for adding the conversion script @sahil-kabir ! Waiting for the CI to pass and I'll merge :)
* adding option for 2.5
* minor - arg in conversion script
* getting started on modelling.py
* minor - shouldve been using modular
* adressing comments + fixing datatype/device _get method
* minor
* commiting suggestion
  Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* docs + first test
* ruff fix
* minor fix
* ruff fix
* model fix
* model fix
* fine-grained check, with a hardcoded score from the original Hf implementation.
* minor ruff
* update tests values with CI hardware
* adding 2.5 to conversion script
* Apply style fixes

Co-authored-by: Sahil Kabir <sahilkabir@Sahils-MacBook-Pro.local>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* logging -> logger * [FPQuant] MXFP8 and MXFP4 backwards support (huggingface#41897) * FP-Quant backwards * fp-quant v0.3.0 docker * availability version bump * fp_quant==0.3.1 * fp_quant v0.3.2 * add working auto_docstring for processors * add auto_docstring to processors first part * add auto_docstring to processors part 2 * modifs after review * fully working auto_docstring and check_docstring with placeholder docstrings * Working check_docstrings for Typed dicts * Add recurring processor args to auto_docstring and add support for removing redundant docstring and placeholders * replace placeholders with real docstrings * fix copies * fixup * remove unwanted changes * fix unprotected imports * Fix unprotected imports * fix unprotected imports * Add __call__ to all docs of processors * nits docs --------- Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com> Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com> Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> Co-authored-by: Rémi Ouazan <83456801+remi-or@users.noreply.github.com> Co-authored-by: Ferdinand Mom <47445085+3outeille@users.noreply.github.com> Co-authored-by: Ryan Mullins <ryanmullins@google.com> Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Guillaume LEGENDRE <glegendre01@gmail.com> Co-authored-by: kaixuanliu <kaixuan.liu@intel.com> Co-authored-by: Sahil Kabir <66221472+sahil-kabir@users.noreply.github.com> Co-authored-by: Sahil Kabir 
<sahilkabir@Sahils-MacBook-Pro.local> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: James <67161633+gjamesgoenawan@users.noreply.github.com> Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com> Co-authored-by: Yacklin Wong <139425274+Yacklin@users.noreply.github.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: MilkClouds <claude@maum.ai> Co-authored-by: ARAVINDHAN T <arvindhant01@gmail.com> Co-authored-by: Pritam Das <79273068+DeXtAr47-oss@users.noreply.github.com> Co-authored-by: Andrei Panferov <andrei@panferov.org>
Main contributor: @sahil-kabir
Guided by: @PavloFesenko
What does this PR do?
Fixes #39549: Is there a plan to integrate ColQwen2.5 into Transformers?
A previous PR addressed this but was closed because ColQwen2.5 can be supported without adding a new model class. This PR allows ColQwen2.5 weights to be loaded with the existing ColQwen2 class.
Edit:
Looks like passing in the weights for ColQwen2.5 and the config for Qwen2.5 works on its own. All I'm adding now is an integration test and documentation.
I made an endpoint with ColQwen2.5 weights by merging the adapter weights from `vidore/colqwen2.5-v0.2` into `Qwen/Qwen2.5-VL-3B-Instruct` (as per the documentation on the model card for `vidore/colqwen2.5-v0.2`) and pushing the merged weights up to `Sahil-Kabir/colqwen2.5-v0.2-hf`. I also moved some extra config files from `vidore/colqwen2.5-v0.2` over to `Sahil-Kabir/colqwen2.5-v0.2`. I've added an integration test that uses this endpoint and shows that it passes the basic tests. The test I wrote is passing.
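For context on what the integration test checks: ColQwen models score query-document pairs with ColBERT-style late interaction (MaxSim) over multi-vector embeddings. A minimal pure-Python sketch of that scoring rule, with toy 2-dimensional embeddings (the real model produces one embedding per token and scores via `processor.score_retrieval`):

```python
def maxsim_score(query_embs, doc_embs):
    """ColBERT-style late interaction: for each query token embedding, take
    the max dot product over all document token embeddings, then sum over
    the query tokens."""
    total = 0.0
    for q in query_embs:
        total += max(sum(qi * di for qi, di in zip(q, d)) for d in doc_embs)
    return total


# toy example: two query token vectors, two document token vectors
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [0.5, 0.5]]
print(maxsim_score(query, doc))  # 1.0 + 0.5 = 1.5
```

The integration test compares scores like these against reference values recorded on CI hardware.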
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Hey @yonigozlan, with minimal code changes, it looks like ColQwen2.5 can be loaded using the ColQwen2 class. By running `python transformers/src/transformers/models/colqwen2/convert_colqwen2_weights_to_hf.py --model_id cjkasbdkjnlakb/colqwen2.5-v0.2-merged --original_vlm_name_or_path Qwen/Qwen2.5-VL-3B-Instruct --output_dir ./colqwen2_5 --using_qwen2_5`, I'm able to get a model that seems to produce outputs similar to the ColQwen2.5 model in the colpali-engine repo. @PavloFesenko and I are working on figuring out where the ColQwen2.5 adapter weights should be imported from, along with tests and documentation.
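Conceptually, a conversion script like this remaps the original checkpoint's parameter names onto the module layout the Transformers ColQwen2 class expects. A minimal sketch of that kind of key remapping — the prefixes below are purely illustrative, not the actual mapping table used by `convert_colqwen2_weights_to_hf.py`:

```python
def remap_checkpoint_keys(state_dict, prefix_map):
    """Rename state-dict keys by longest-prefix substitution. Keys that match
    no prefix in prefix_map are kept unchanged."""
    remapped = {}
    for key, value in state_dict.items():
        for old, new in prefix_map.items():
            if key.startswith(old):
                key = new + key[len(old):]
                break
        remapped[key] = value
    return remapped


# hypothetical mapping for illustration only
prefix_map = {"model.": "vlm.model.", "lm_head.": "vlm.lm_head."}
ckpt = {"model.layers.0.weight": 1, "custom_text_proj.weight": 2}
print(remap_checkpoint_keys(ckpt, prefix_map))
# {'vlm.model.layers.0.weight': 1, 'custom_text_proj.weight': 2}
```

The real script additionally loads the merged weights, writes the ColQwen2 config, and saves everything in the Hub format.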